Experimentation Procedure for Offloaded Mini-Apps Executed on Cluster Architectures with Xeon Phi Accelerators

Authors

  • Gary Lawson
  • Vaibhav Sundriyal
  • Masha Sosonkina
  • Yuzhong Shen
Abstract

A heterogeneous cluster architecture is complex: it contains hundreds or thousands of devices connected by a tiered communication system in order to solve a problem. Because the system is heterogeneous, these devices have varying performance capabilities. To better understand the interactions that occur between the devices during execution, an experimentation procedure has been devised to capture, store, and analyze meaningful data. The procedure consists of tools, techniques, and methods for capturing relevant timing, power, and performance data for a typical execution. This procedure currently applies to architectures with Intel Xeon processors and Intel Xeon Phi accelerators. It has been applied to the Co-Design Molecular Dynamics mini-app, courtesy of the ExMatEx team. This work aims to provide end-users with a strategy for investigating codes executed on heterogeneous cluster architectures with Xeon Phi accelerators.

Similar resources

Understanding the Costs of Many-Task Computing Workloads on Intel Xeon Phi Coprocessors

Many-Task Computing (MTC) aims to bridge the gap between HPC and HTC. MTC emphasizes running many computational tasks over a short period of time, where tasks can be either dependent or independent of one another. MTC has been well supported on Clouds, Grids, and Supercomputers on traditional computing architectures, but the abundance of hybrid large-scale systems using accelerators has motivat...

Towards Modeling Energy Consumption of Xeon Phi

In the push for exascale computing, energy efficiency is of utmost concern. System architectures often adopt accelerators to hasten application execution at the cost of power. The Intel Xeon Phi co-processor is a unique accelerator that offers application designers high degrees of parallelism, energy-efficient cores, and various execution modes. To explore the vast number of available configurati...

Optimization of stencil-based fusion kernels on Tera-flops many-core architectures

We present the optimization of kernels from fusion plasma codes, GYSELA and GT5D, on Tera-flops many-core architectures including accelerators (Xeon Phi, Tesla K20X) and CPUs (FX100). Through the optimization, we found that the structure of arrays (SoA) style implementation is effective for SIMD operations on all architectures, and high cache locality, which is achieved in GYSELA, is of critical...

A Randomized LU-based Solver Using GPU and Intel Xeon Phi Accelerators

We present a fast hybrid solver for dense linear systems based on LU factorization. To achieve good performance, we avoid pivoting by using random butterfly transformations for which we developed efficient implementations on heterogeneous architectures. We used both Graphics Processing Units and Intel Xeon Phi as accelerators. The performance results show that the pre-processing due to randomiz...

GEMTC: GPU Enabled Many-Task Computing

Current software and hardware limitations prevent Many-Task Computing (MTC) workloads from leveraging hardware accelerators (NVIDIA GPUs, Intel Xeon Phi) boasting Many-Core Computing architectures. Some broad application classes that fit the MTC paradigm are workflows, MapReduce, high-throughput computing, and a subset of high-performance computing. MTC emphasizes using many computing resources...

Journal:
  • CoRR

Volume abs/1509.02135

Publication date 2015